

Irrelevance in Problem Solving 

Alon Y. Levy 

Knowledge Systems Laboratory 
Stanford University 

701 Welch Road, Bldg. C, Palo Alto, CA 94304 
alevy@cs.stanford.edu 


Abstract 

The notion of irrelevance underlies many different 
works in AI, such as detecting redundant facts, 
creating abstraction hierarchies and reformulation 
and modeling physical devices. However, in order 
to design problem solvers that exploit the notion 
of irrelevance, either by automatically detecting 
irrelevance or by being given knowledge about ir- 
relevance, a formal treatment of the notion is re- 
quired. 

In this paper we present a general framework for 
analyzing irrelevance. We discuss several prop- 
erties of irrelevance and show how they vary in 
a space of definitions outlined by the framework. 
We show how irrelevance claims can be used to 
justify the creation of abstractions thereby sug- 
gesting a new view on the work on abstraction. 

Introduction 

Meta-level reasoning has received a lot of attention
from researchers in artificial intelligence as a means
of guiding problem solvers in their search for solutions
[Hayes, 1973; Genesereth, 1988; Smith and Genesereth,
1985; Clancey, 1983]. A common meta-level strategy is
to avoid using knowledge that is irrelevant to the goal
at hand. In fact, the notion of irrelevance has been a
common theme in many research works, but its formal
analysis has received attention from only a few
researchers, such as Subramanian and Genesereth
[Subramanian and Genesereth, 1987; Subramanian, 1989].
The ability to give a problem
solver advice about what parts of a knowledge base are 
irrelevant to a specific problem solving goal is a power- 
ful method to reduce its search. For example, consider 
a domain in which we are trying to find routes between 
cities in the country, using flights, trains and busses. 
For some goals, we might want to advise the problem 
solver that rules and facts about flights are irrelevant, 
either because the minimal price of flights is known to 
be greater than is required for the specific goal or be- 
cause we know that flights will not yield an optimal 
solution. By giving this advice, we significantly reduce
the size of the search space explored by the problem
solver.

The notion of irrelevance also plays a key role in 
work on abstractions and change of representation. In- 
tuitively, when we want to create a simpler or abstract 
representation we remove some irrelevant detail. If the 
removed detail was indeed irrelevant then the solution 
to the problem in the abstract theory will map back to 
a solution in the original theory. Therefore, if we can 
provide the system with knowledge about irrelevance 
or relative irrelevance of knowledge, the system can 
exploit it to automatically create abstractions. Meth- 
ods for mechanically detecting relevance can be used 
to automatically create abstractions. 

However, both in order for a user to be able to 
state such claims to a system in a principled man- 
ner and for the system to make proper use of given 
claims, a better analysis of the notion of irrelevance 
in problem solving is required. This paper describes 
a general framework for analyzing the notion of irrel- 
evance. We define a space of possible definitions of 
irrelevance by identifying several axes along which ir- 
relevance claims differ. Several important properties of 
irrelevance concerning their usage in problem-solving 
are outlined and we show how varying the definition 
of irrelevance in our space affects the satisfaction of 
these properties. Next, we discuss how irrelevance 
claims can serve as justifications for creating an ab- 
straction. The case of irrelevance of a distinction be- 
tween properties (represented as predicates) is exam- 
ined in detail and we show how such a claim serves as a 
justification for predicate abstraction [Plaisted, 1981; 
Tenenberg, 1987]. 

This framework makes several contributions. First,
it clarifies the issues involved in the notion of
irrelevance, thereby enabling us to better exploit the
notion in works that rely on it, such as the work on
detecting redundant facts or creating abstraction
hierarchies. The properties of irrelevance that we outline
provide guidance in building a system that incorporates
such claims. Giving precise definitions of irrelevance
formalizes the problem of automatically deducing
irrelevance facts, thereby enabling us to automatically
create abstractions based on deduced irrelevance claims.
Moreover, since our framework provides a language to
express knowledge about irrelevance, we can use this
language to express knowledge about the domain that
can help reduce the size of the search or justify creating
an abstraction.

Preliminaries 

Assume our theory of the domain is represented by a
knowledge base of first order predicate calculus formulas,
$\Delta$. A problem solving goal (or query) is represented
by a formula $\psi$. The goal is to find whether $\psi$
is implied by $\Delta$ (or, if $\psi$ has free variables, we
want to know which assignments to the variables result in
a formula that is entailed by $\Delta$). Our aim is to
identify facts that are irrelevant to $\psi$ in order to
reduce the search space generated for $\psi$. Formalizing
the concept of irrelevance can be done at several levels.
For example, one can formalize irrelevance in terms of
the models of $\Delta$ and $\psi$, i.e., a semantic level
analysis. Irrelevance can also be analyzed in terms of the
facts in the theory $\Delta$, a so-called meta-theoretic
analysis [Subramanian, 1989]. Alternatively, one can give a
proof-theoretic analysis of irrelevance, in terms of the
actual set of derivations the problem solver can explore
in the search to solve $\psi$. Although these levels are by
no means independent, it is important to distinguish
between them when defining irrelevance or comparing
definitions.

The goal of this paper is to define notions of irrelevance
that enable us to optimize actual problem solving.
Therefore, we analyze irrelevance from the system’s view
of the problem-solving process, which is a proof-theoretic
one. The system does not actually see the world as the
user sees it, nor does it see the conceptualization of the
world. Instead, it sees the set of symbols used to describe
the domain and the set of derivations it can generate.

Example 1: Suppose we are using a resolution theorem
prover on a knowledge base in clause form¹. Consider
the following two theories:

$T_1 = \{f \Rightarrow g,\ \neg f \Rightarrow g\}$
$T_2 = \{g\}$.

$T_1$ and $T_2$ are satisfied by the same set of models. In
each, the value assigned to $f$ does not affect the value
of $g$, and therefore we might consider $f$ to be irrelevant
to $g$. However, in $T_1$, the theorem prover will have to
resolve on the symbol $f$ to derive $g$, and therefore, as
far as it is concerned, it can’t ignore the symbol $f$. ∎
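
The point of Example 1 can be checked mechanically. The following is a minimal propositional sketch, not part of the original framework: clauses are encoded as sets of string literals with `~` for negation, and the helper names (`negate`, `resolvents`, `models`) are assumptions introduced here for illustration. It confirms that the two theories have the same models, yet the only way to obtain $g$ from the first theory is a resolution step on $f$.

```python
from itertools import product

def negate(lit):
    """Return the complementary literal ('f' <-> '~f')."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All clauses obtained by resolving c1 and c2 on one complementary pair."""
    out = set()
    for lit in c1:
        if negate(lit) in c2:
            out.add(frozenset((c1 - {lit}) | (c2 - {negate(lit)})))
    return out

def models(theory, symbols):
    """Truth assignments (as tuples ordered like `symbols`) satisfying every clause."""
    result = set()
    for values in product([False, True], repeat=len(symbols)):
        assign = dict(zip(symbols, values))
        def holds(lit):
            return not assign[lit[1:]] if lit.startswith("~") else assign[lit]
        if all(any(holds(l) for l in clause) for clause in theory):
            result.add(values)
    return result

# f => g is the clause {~f, g}; ~f => g is the clause {f, g}.
T1 = {frozenset({"~f", "g"}), frozenset({"f", "g"})}
T2 = {frozenset({"g"})}
goal = frozenset({"g"})

same_models = models(T1, ["f", "g"]) == models(T2, ["f", "g"])
goal_is_base_fact_in_T1 = goal in T1
c1, c2 = tuple(T1)
derived = resolvents(c1, c2)  # the single resolution step, on the symbol f
```

Semantically $f$ plays no role, but `goal_is_base_fact_in_T1` is false, so a resolution prover working on $T_1$ has to touch $f$ before it reaches $g$.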

Note that we are not claiming that irrelevance relations
in the domain are not useful to control problem
solving; quite the contrary. Most irrelevance facts are
based on properties of the domain. However, a relevance
relation in the domain will only be useful if it is
reflected in the representation.

¹ For clarity, in this document we do not use clause form
notation but assume the problem solver gets formulas in
clause form.

In particular, for a problem solver to exploit irrelevance
claims, the following properties of irrelevance
claims will be of interest. Assume $IR(\varphi, \psi, \Delta)$
denotes that the fact (or set of facts) $\varphi$ is irrelevant
to the goal $\psi$ with respect to the theory $\Delta$.

• What can the problem solver do given the irrelevance
claim? Can it ignore a fact that is deemed
irrelevant? Can it ignore any fact that contains it as
a subexpression?

• Do irrelevance claims add up? If $IR(\varphi_1, \psi, \Delta)$
and $IR(\varphi_2, \psi, \Delta)$ hold, does that imply that
$IR(\{\varphi_1, \varphi_2\}, \psi, \Delta)$ holds? If so, we can
use all the irrelevance claims that are available to us at a
given instant. However, if not, we can only use one at a
time, and then we must check that the others still
hold in the resulting theory.

• Is irrelevance a monotonic property? I.e., if we add
more facts to the knowledge base, can irrelevant facts
become relevant or vice versa?

• Does the irrelevance of a subject imply the irrelevance
of a subject which is syntactically related to
it? E.g., does $IR(\varphi, \psi, \Delta)$ imply
$IR(\neg\varphi, \psi, \Delta)$ or $IR(\varphi \vee \varphi_1, \psi, \Delta)$?
Such properties will enable us to use a given set of
irrelevance claims to deduce additional ones.

• Can irrelevance claims be found automatically by
examining the KB?

An important issue in a definition of irrelevance is
the subject of irrelevance, i.e., the type of entity being
deemed irrelevant to the goal. So far we discussed only
the irrelevance of a fact (or set of facts) to a problem
solving goal, but the subject may be any kind of entity
in the representation, such as object constants,
predicate symbols and functions. The irrelevance subject
can also be more abstract, such as a decision to
distinguish between a set of predicates or objects in
the representation. The following is an example of the
irrelevance of a predicate distinction.

Example 2: Consider the knowledge base with the
following facts.

$r_1:\ \mathit{SportsCar}(x) \Rightarrow \mathit{Vehicle}(x)$

$r_2:\ \mathit{FamilyCar}(x) \Rightarrow \mathit{Vehicle}(x)$

$r_3:\ \mathit{SportsCar}(x) \Rightarrow \mathit{HighRiskInsurance}(x)$

$r_4:\ \mathit{FamilyCar}(x) \Rightarrow \neg \mathit{SportsCar}(x)$

$r_5:\ \mathit{FamilyCar}(\mathit{Camry})$

In order to solve the query $\mathit{Vehicle}(x)$, the distinction
between the relations $\mathit{SportsCar}$ and $\mathit{FamilyCar}$
is irrelevant. Intuitively, all that matters for the proof
is that $x$ is some kind of car. Therefore, we can remove
the distinction in the representation by predicate
abstraction [Tenenberg, 1987]. We express the theory
using an abstract predicate, $\mathit{Car}$, as follows:

$s_1:\ \mathit{Car}(x) \Rightarrow \mathit{Vehicle}(x)$

$s_2:\ \mathit{Car}(\mathit{Camry})$

$r_1$ and $r_2$ were abstracted to $s_1$, while $r_5$ was
abstracted to $s_2$. $r_3$, on the other hand, was a rule specific
to $\mathit{SportsCar}$, because
$\mathit{FamilyCar}(x) \Rightarrow \mathit{HighRiskInsurance}(x)$
does not hold, and therefore we cannot abstract it to
$\mathit{Car}(x) \Rightarrow \mathit{HighRiskInsurance}(x)$.

Consequently, it is removed from the theory. Similarly,
$r_4$ is a formula that distinguishes between the
relations $\mathit{FamilyCar}$ and $\mathit{SportsCar}$ and therefore is
removed from a theory that ignores the distinction between
these relations. ∎
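
The abstraction step of Example 2 can be sketched in a few lines. This is an illustrative simplification, not the paper's procedure: rules are restricted to one unary body predicate and one head, and "the corresponding rule holds for every predicate in the distinction" is approximated by a purely syntactic check that the rule is literally present for each of them. The function name `abstract_theory` and the tuple encoding are assumptions made here.

```python
# Rules are (body_predicate, head_predicate); a head starting with "~" is negated.
RULES = {
    "r1": ("SportsCar", "Vehicle"),
    "r2": ("FamilyCar", "Vehicle"),
    "r3": ("SportsCar", "HighRiskInsurance"),
    "r4": ("FamilyCar", "~SportsCar"),
}
# Ground facts are (predicate, constant).
FACTS = {"r5": ("FamilyCar", "Camry")}
DISTINCTION = {"SportsCar", "FamilyCar"}
ABSTRACT = "Car"

def abstract_theory(rules, facts, distinction, abstract):
    """Replace the distinguished predicates by `abstract`, dropping rules that depend on the distinction."""
    present = set(rules.values())
    kept_rules = set()
    for body, head in rules.values():
        if head.lstrip("~") in distinction:
            # The rule itself separates the predicates (like r4): drop it.
            continue
        if body in distinction:
            # Crude stand-in for provability: the same rule must be stated
            # for every predicate in the distinction (r1 and r2, but not r3).
            if all((p, head) in present for p in distinction):
                kept_rules.add((abstract, head))
        else:
            kept_rules.add((body, head))
    kept_facts = {(abstract if p in distinction else p, c) for p, c in facts.values()}
    return kept_rules, kept_facts

abstract_rules, abstract_facts = abstract_theory(RULES, FACTS, DISTINCTION, ABSTRACT)
```

Under these assumptions the result contains exactly the counterparts of $s_1$ and $s_2$; a faithful implementation would replace the syntactic check by a call to the problem solver.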

A final issue that factors into a definition of irrelevance
is the space of possible changes of the representations
and the theory (or weakenings of the theory [Subramanian
and Genesereth, 1987]) we are considering in
order to remove the irrelevancy. In Example 1, we only
considered changing the theory by removing facts and
therefore we could not justifiably say that $f$ is irrelevant
to $g$. However, had we considered changing the
theory by adding some of its logical consequences (e.g.,
$g$), we could deem $f$ irrelevant to $g$. In Example 2, the
irrelevancy was removed by predicate abstraction, i.e.,
replacing the predicates $\mathit{FamilyCar}$ and $\mathit{SportsCar}$
by an abstract predicate $\mathit{Car}$.

A Space of Irrelevancies

To capture the various properties of irrelevance we define
a space of possible definitions of irrelevance. The
space of definitions revolves around the set of possible
derivations of the goal. Let $\Delta$ be a knowledge base,
$\psi$ be a goal and $\mathcal{D}$ be the set of derivations of
$\psi$ from $\Delta$. A definition of irrelevance of $\varphi$
(which can be any irrelevance subject) to $\psi$ is composed
of the following choices:

A1. Defining irrelevance of $\varphi$ with respect to a single
derivation, $D \in \mathcal{D}$.

A2. A subset $\mathcal{D}_0$ of $\mathcal{D}$ over which to quantify A1.

A3. The method of quantification over $\mathcal{D}_0$, i.e.,
existentially or universally.

Formally, if $D$ is a derivation of a goal $\psi$ from a
knowledge base $\Delta$, we denote the choice for A1 by
$Ir(\varphi, \psi, D)$, i.e., that $\varphi$ is irrelevant to the
derivation $D$ of the goal $\psi$. If $\Phi$ is a set of facts,
$Ir(\Phi, \psi, D)$ holds if $Ir(\varphi_i, \psi, D)$ holds for
all $\varphi_i \in \Phi$.

Definition 3: Let $\mathcal{D}_0$ be a set of derivations of a
goal $\psi$ from the knowledge base $\Delta$². $\varphi$ is said to
be weakly irrelevant to $\psi$ with respect to $\mathcal{D}_0$,
denoted by $WI(\varphi, \psi, \mathcal{D}_0)$³, if $Ir(\varphi, \psi, D)$
holds for some $D \in \mathcal{D}_0$. $\varphi$ is said to be strongly
irrelevant, denoted by $SI(\varphi, \psi, \mathcal{D}_0)$, if
$Ir(\varphi, \psi, D)$ holds for every $D \in \mathcal{D}_0$. ∎

² If $\psi$ is a set of goals (e.g., a goal with free variables) we
consider a set of derivations for every element of $\psi$. The
definitions below hold if they hold for every element of $\psi$.

³ Note that the knowledge base $\Delta$ is implicit in the third
argument of $WI$ and $SI$.

Note that in Definition 3, the knowledge base $\Delta$
does not appear explicitly in $WI$ ($SI$), but is implicit
in the set $\mathcal{D}_0$. For every choice of $Ir$ and of
$\mathcal{D}_0$, we get a definition for weak and strong
irrelevance. Besides $\mathcal{D}_0 = \mathcal{D}$, examples of
$\mathcal{D}_0$ include the set of all minimal derivations⁴, or all
derivations bounded by some resource constraints. The
following example clarifies some of these distinctions.

Example 4: Consider a knowledge base with the following
rules:

$r_1:\ E(x) \Rightarrow Q(x)$
$r_2:\ R(x) \Rightarrow Q(x)$
$r_3:\ P(x) \Rightarrow Q(x)$
$r_4:\ E(x) \Rightarrow P(x)$
$r_5:\ Q(x) \Rightarrow P(x)$

Figure 1: Search space for a goal $Q(x)$

The knowledge base also contains a set of ground
facts, but only for the predicate $E$. Figure 1 shows the
possible derivations that can be generated for $Q$ from
this theory. Suppose we define $Ir(r, \psi, D)$ to hold if
the rule $r$ does not appear in the derivation $D$. Let $\mathcal{D}$
be the set of all derivations of $Q(a)$⁵. $WI(r_3, Q(a), \mathcal{D})$
holds since whenever $Q(a)$ is derivable, there will be
a derivation of $Q(a)$ using only $r_1$. $SI(r_2, Q(a), \mathcal{D})$
holds because $r_2$ cannot be part of a proof of $Q(a)$.
$SI(r_5, Q(a), \mathcal{D})$ does not hold; however, if we consider
the set of non-redundant derivations $\mathcal{D}_0$⁶, then
$SI(r_5, Q(a), \mathcal{D}_0)$ holds. ∎
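
The claims of Example 4 can be reproduced with a small enumeration. This is a sketch under stated assumptions: every rule has a single body atom, so a derivation is a chain and is encoded as a tuple of `(rule, goal)` steps; the set of all derivations is infinite because of $r_5$, so the enumeration is cut off at a hypothetical depth bound of 6; and $E(a)$ is assumed to be in the knowledge base. The function names are introduced here for illustration only.

```python
# Each rule is (head_predicate, body_predicate); all atoms share the constant a.
RULES = {
    "r1": ("Q", "E"),
    "r2": ("Q", "R"),
    "r3": ("Q", "P"),
    "r4": ("P", "E"),
    "r5": ("P", "Q"),
}
EDB = {"E"}  # predicates with ground facts; assumes E(a) is present

def derivations(goal, depth):
    """Yield derivations of `goal` with at most `depth` rule applications, root first."""
    if depth == 0:
        return
    for name, (head, body) in RULES.items():
        if head != goal:
            continue
        if body in EDB:
            yield ((name, goal),)
        else:
            for rest in derivations(body, depth - 1):
                yield ((name, goal),) + rest

def non_redundant(d):
    """A chain is non-redundant when no goal repeats along it."""
    goals = [g for _, g in d]
    return len(set(goals)) == len(goals)

def ir(rule, d):
    """Ir(r, psi, D): the rule does not appear in the derivation."""
    return all(name != rule for name, _ in d)

def weakly_irrelevant(rule, ds):
    return any(ir(rule, d) for d in ds)

def strongly_irrelevant(rule, ds):
    return all(ir(rule, d) for d in ds)

D = list(derivations("Q", 6))          # bounded stand-in for all derivations
D0 = [d for d in D if non_redundant(d)]  # the non-redundant ones
```

With this encoding, `D0` contains only the chains through $r_1$ and through $r_3, r_4$, which is why $r_5$ is strongly irrelevant with respect to `D0` but not with respect to `D`.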

Irrelevance of a Fact 

In this section we briefly consider the case in which the
irrelevance subject is a single fact, and show how varying
the choices for A1-A3 affects the properties of the
resulting irrelevance claims. The definitions consider a
specific problem solver, hence our discussion assumes
we are using a resolution theorem prover. A derivation
is a resolution tree of clauses, where the goal clause
(or the empty clause in case of a refutation proof) is
the root, and the children of every clause are the two
clauses that were resolved in order to get it. The leaves
of the tree are clauses from the knowledge base, and
they are denoted by $Base(D)$.

⁴ Given some criteria of minimality of deductions.

⁵ Which will be empty if $E(a)$ is not in the knowledge
base.

⁶ A derivation tree is redundant if it has two identical
nodes $n_1$ and $n_2$ such that $n_1$ is an ancestor of $n_2$.

Consider three choices for A1. In the first, a fact
is irrelevant to a derivation of the goal if it is not one
of the knowledge-base facts used in it:

Definition 5: $Ir_1(\varphi, \psi, D)$ if $\varphi \notin Base(D)$. ∎

A stronger definition requires that $\varphi$ appear nowhere
in the derivation:

Definition 6: $Ir_2(\varphi, \psi, D)$ iff there does not exist a
substitution $\sigma$ such that $\varphi\sigma$ is a subclause of a clause
in $D$. ∎

Subramanian [Subramanian, 1989] defines $\varphi$ to be
irrelevant to $\psi$ with respect to a theory $\Delta$ if there is
a subset of $\Delta$ that entails $\psi$ but is non-committal on
$\varphi$. In our space, we can formalize this as follows:

Definition 7: $Ir_3(\varphi, \psi, D)$ if $Base(D) \nvdash \varphi$ and
$Base(D) \nvdash \neg\varphi$. ∎

Using $Ir_3$, for a refutation resolution theorem
prover, $WI(\varphi, \psi, \mathcal{D})$ is equivalent to the definition
given in [Subramanian, 1989].
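
To make the difference between Definitions 5 and 6 concrete, here is a propositional sketch. It deliberately ignores substitutions (so "subclause" becomes plain set inclusion), and the `Node` class and helper names are assumptions introduced for illustration. The derivation used is the one from Example 1, where $g$ is obtained by resolving $\neg f \vee g$ with $f \vee g$.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Clause = FrozenSet[str]

@dataclass(frozen=True)
class Node:
    """A node of a resolution tree: a clause and the clauses it was resolved from."""
    clause: Clause
    children: Tuple["Node", ...] = ()

def clauses(d):
    """Every clause occurring anywhere in the derivation."""
    yield d.clause
    for child in d.children:
        yield from clauses(child)

def base(d):
    """Base(D): the clauses at the leaves of the derivation."""
    if not d.children:
        return {d.clause}
    return set().union(*(base(child) for child in d.children))

def ir1(phi, d):
    """Definition 5 (propositional case): phi is not a leaf of D."""
    return phi not in base(d)

def ir2(phi, d):
    """Definition 6 without substitutions: phi is a subclause of no clause in D."""
    return not any(phi <= c for c in clauses(d))

g = frozenset({"g"})
derivation = Node(g, (Node(frozenset({"~f", "g"})), Node(frozenset({"f", "g"}))))
unrelated = frozenset({"h"})
```

Here `ir1(g, derivation)` holds because $g$ is not a knowledge-base fact of the derivation, while `ir2(g, derivation)` fails because $g$ occurs inside its clauses; a clause such as `unrelated` satisfies both.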

Figure 2 summarizes the different properties that hold
for the definitions described above. The following
observations show how the properties of weak irrelevance
differ from those of strong irrelevance.

Observation 8: Whenever irrelevance adds up on a
single derivation, it will add up for strong irrelevance,
i.e., if

$Ir(\Phi_1, \psi, D) \wedge Ir(\Phi_2, \psi, D) \Rightarrow Ir(\{\Phi_1, \Phi_2\}, \psi, D)$

holds for any $D$, then for any choice of $\mathcal{D}_0$,

$SI(\Phi_1, \psi, \mathcal{D}_0) \wedge SI(\Phi_2, \psi, \mathcal{D}_0) \Rightarrow SI(\{\Phi_1, \Phi_2\}, \psi, \mathcal{D}_0)$

This property does not hold for weak irrelevance. ∎

Observation 9: The converse holds for weak irrelevance
too, i.e., whenever

$Ir(\{\Phi_1, \Phi_2\}, \psi, D) \Rightarrow Ir(\Phi_1, \psi, D) \wedge Ir(\Phi_2, \psi, D)$

holds for any $D$, then for any choice of $\mathcal{D}_0$,

$WI(\{\Phi_1, \Phi_2\}, \psi, \mathcal{D}_0) \Rightarrow WI(\Phi_1, \psi, \mathcal{D}_0) \wedge WI(\Phi_2, \psi, \mathcal{D}_0)$

$SI(\{\Phi_1, \Phi_2\}, \psi, \mathcal{D}_0) \Rightarrow SI(\Phi_1, \psi, \mathcal{D}_0) \wedge SI(\Phi_2, \psi, \mathcal{D}_0)$

∎

Observation 10: For any definition of $Ir$ such that
$Ir(\varphi, \psi, D) \Rightarrow Ir_1(\varphi, \psi, D)$, if we add facts to the
knowledge base, irrelevance can change as follows. A
fact that was weakly irrelevant will still be weakly
irrelevant. A fact that was strongly irrelevant will be
at least weakly irrelevant. A fact that was not weakly
irrelevant might become weakly irrelevant. ∎

$P_1$: $WI(\varphi, \psi, \mathcal{D})$ implies that $\Delta \setminus \varphi \vdash \psi$.

$P_2$: $WI(\varphi, \psi, \mathcal{D})$ implies that the problem solver
can ignore any derivation that contains $\varphi$.

$P_3$: $WI(\varphi, \psi, \mathcal{D})$ implies that the problem solver
can ignore any derivation that contains
$\varphi$ as a subexpression.

$P_4$: Adding up:
$Ir(\Phi_1, \psi, D) \wedge Ir(\Phi_2, \psi, D) \Rightarrow Ir(\{\Phi_1, \Phi_2\}, \psi, D)$.

$P_5$: Transfers through equivalence:
$Ir(\varphi_1, \psi, D) \wedge (\varphi_1 \equiv \varphi_2) \Rightarrow Ir(\varphi_2, \psi, D)$.

$P_6$: If $\varphi$ is a subclause of $\varphi_1$, then
$Ir(\varphi, \psi, D) \Rightarrow Ir(\varphi_1, \psi, D)$.

Figure 2: Properties of Irrelevance 


Deducing Irrelevance Claims 

Varying the definition of irrelevance has drastic effects
on the ability to automatically derive irrelevance
claims. Given a knowledge base $\Delta$ and a goal $\psi$, we
would like to derive all (or part of) the facts in $\Delta$ that
are irrelevant to $\psi$. In general, looking at the whole
knowledge base to determine irrelevance will be more
costly than solving the query. A more interesting question
is whether irrelevance claims can be derived by
looking at only a small and stable part of the knowledge
base. For example, in Example 4, we were able to
determine irrelevance by merely looking at the structure
of the proof space created by the rules, regardless
of the specific ground facts for the predicate $E$.

We examine this question for knowledge bases comprised
of a set of Horn rules with no function symbols
(Datalog, [Ullman, 1989]), and a database of ground
atomic facts. We distinguish between two sets of predicates
in the knowledge base: the extensional predicates
(EDB predicates), which are those that appear only
in the database and in antecedents of rules, and the
intensional predicates (IDB predicates), which are the
predicates appearing in the consequents of the rules,
i.e., the predicates that are being defined by the EDB
predicates and the rules. A query is an IDB predicate,
i.e., to find all the derivable facts for that predicate.
Every derivable instance of the goal has one or more
derivation trees. A derivation tree is
a tree consisting of goal-nodes and rule-nodes. A goal-node
is labeled by a ground atom, and it has a single
child, which is an instantiated rule-node. The head
of an instantiated rule-node is identical to its parent
goal-node. A rule-node has a child goal-node for each
one of its subgoals. The leaves of a derivation tree are
goal-nodes labeled by ground atoms from the EDB. A
derivation is not minimal (or is redundant) if there are
two identical goal-nodes $n_1$ and $n_2$ such that $n_1$ is an
ancestor of $n_2$. A rule $r$ is irrelevant to a derivation $D$
(i.e., $Ir(r, \psi, D)$) if none of the rule-nodes in $D$ are
instances of $r$ (note that this is equivalent to $Ir_1$ and
$Ir_2$).

The question we address is the following. Given a set
of rules, $\mathcal{P}$, a query $q$ and a definition of irrelevance,
can we determine whether a rule $r \in \mathcal{P}$ is irrelevant
to the query for any possible set of ground facts in the
knowledge base? We consider two choices for A2: the
set of all derivations of the goal $q$, denoted by $\mathcal{D}$, and
the set of all minimal derivations, $\mathcal{D}_0$.

Finding irrelevant rules enables us to significantly
prune the search space for the query. In Example 4,
rule $r_2$ will not appear in any derivation of
$Q$, therefore $SI(r_2, Q(x), \mathcal{D})$ holds. $r_5$ will appear
only in redundant derivations of $Q$ and therefore
$SI(r_5, Q(x), \mathcal{D}_0)$ holds. Since $Q(x)$ can always be
derived using either $r_1$ or $\{r_3, r_4\}$, both $WI(r_1, Q(x), \mathcal{D})$
and $WI(\{r_3, r_4\}, Q(x), \mathcal{D})$ hold. Consequently,
identifying the various kinds of irrelevance can enable us
to compute $Q$ using only $r_1$. Considering constraint
literals in the rules enables us to derive additional
irrelevance claims:


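Example 11: Consider the following knowledge base:

$s_1:\ Q(x, z) \wedge Q_1(z, y) \wedge x < z \Rightarrow P(x, y)$
$s_2:\ Q(z, x) \wedge Q_1(x, y) \wedge x < y \Rightarrow P(x, y)$
$s_3:\ E_1(x, y) \wedge x < 3 \Rightarrow Q(x, y)$
$s_4:\ E_2(x, y) \wedge x > 1 \Rightarrow Q_1(x, y)$

If the query is $P(x, y)$, all rules are relevant. However,
if the query is $P(x, y) \wedge (y < 1)$, then $s_2$ is strongly
irrelevant, i.e., $SI(s_2, P(x, y) \wedge (y < 1), \mathcal{D})$ holds:
$s_2$ requires $x < y$, hence $x < 1$, while $Q_1(x, y)$ can
only be derived through $s_4$, which requires $x > 1$. ∎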

Finding all rules which are weakly irrelevant, i.e.,
$WI(r, q, \mathcal{D})$, is precisely the rule redundancy problem
shown to be undecidable by Shmueli [Shmueli, 1987].
Consequently, determining $WI(r, q, \mathcal{D}_0)$ is also
undecidable. For strong irrelevance, if the rules contain
no constraint literals and no object constants,
determining $SI(r, q, \mathcal{D}_0)$ is equivalent to the rule
reachability problem that has an easy polynomial time
solution [Kifer, 1988]. [Levy and Sagiv, 1992] gives an
algorithm for detecting $SI(r, q, \mathcal{D}_0)$ and $SI(r, q, \mathcal{D})$ even
when constraint literals are present. It also establishes
an exponential-time lower bound on the problem of
determining $SI(r, q, \mathcal{D}_0)$.
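
As an illustration of what a reachability-style test looks like, the sketch below computes, for constraint-free rules, which rules can occur in at least one derivation of the query. It is a simplified check written for this discussion and is not the algorithm of any of the cited papers: it only addresses the set of all derivations (so it cannot detect rules such as $r_5$ of Example 4 that occur solely in redundant derivations), and it takes as input the set of EDB predicates assumed to have facts. The function name `usable_rules` is hypothetical.

```python
# Rules of Example 4 as (head_predicate, [body_predicates]).
RULES = {
    "r1": ("Q", ["E"]),
    "r2": ("Q", ["R"]),
    "r3": ("Q", ["P"]),
    "r4": ("P", ["E"]),
    "r5": ("P", ["Q"]),
}

def usable_rules(rules, edb_with_facts, query):
    """Rules that may occur in some derivation of `query` (predicate-level analysis)."""
    # Bottom-up pass: predicates that can have at least one derivable fact.
    productive = set(edb_with_facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules.values():
            if head not in productive and all(b in productive for b in body):
                productive.add(head)
                changed = True
    firable = {name for name, (_, body) in rules.items()
               if all(b in productive for b in body)}
    # Top-down pass: rules whose head is reachable from the query.
    used, reachable, frontier = set(), set(), [query]
    while frontier:
        pred = frontier.pop()
        if pred in reachable:
            continue
        reachable.add(pred)
        for name in firable:
            head, body = rules[name]
            if head == pred:
                used.add(name)
                frontier.extend(body)
    return used

used = usable_rules(RULES, {"E"}, "Q")
never_used = set(RULES) - used  # candidates for strong irrelevance w.r.t. all derivations
```

Both passes are linear scans repeated at most once per predicate, so the check runs in polynomial time in the size of the rule set.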


Using Irrelevance to Justify Abstractions

Much of the work in AI on creating abstraction hierar- 
chies relies on the intuition that by creating an abstract 
theory we are removing some irrelevant detail. If the 
detail removed is indeed irrelevant, then a solution to 
the problem in the abstract theory will map back to 
a solution in the original theory (also referred to as 
the ground theory). Otherwise, we will have to back- 
track between abstraction levels. Although this has 
been the motivation underlying work on abstractions, 
the formal connection between irrelevance and abstrac- 
tions has received little attention (e.g., [Subramanian, 
1989]). For example we can view predicate abstraction 
as being justified by the irrelevance of a distinction be- 
tween predicates; object aggregation can be justified 
by irrelevance of a granularity distinction. Identifying 
abstraction with the notion of irrelevance offers several 
advantages: 

• We make explicit what is being abstracted (i.e., the 
subject of irrelevance). 

• We make clear the strength of the justification for 
the abstraction (by the strength of the type of irrel- 
evance claim that holds). 

• We formalize the problem of automatically creating 
abstractions by translating it to the problem of au- 
tomatically finding irrelevance claims. 

In this section we briefly discuss how irrelevance 
claims that are justifications for abstractions can be 
formulated in our framework. We identify several ir- 
relevance subjects that account for many abstractions 
discussed in the literature. As a consequence we get 
an expressive language to state knowledge about the 
domain that can affect the creation of abstractions. 
We define a notion of irrelevance that best justifies ab- 
stractions and mention several weaker notions. 

The first assumption underlying a formalization of
irrelevance is that removing irrelevant detail should
not enable us to reach new conclusions about the set of
goals we are interested in, i.e., any conclusion reached
in the abstract theory should be an abstraction of one
in the base theory (this is also known as a TD property
of abstractions [Giunchiglia and Walsh, 1991] or the
downward solution property [Tenenberg, 1987]). The
justification for this claim is that by removing irrelevant
detail, we are effectively ignoring some of our
knowledge, and therefore, we cannot come to new
conclusions⁷. For example, when we remove some irrelevant
detail in a planning problem (e.g., an action precondition),
if the resulting abstract plan cannot be
mapped back to a base-level plan, the detail we have
removed was not truly irrelevant to the problem⁸. Second,
the abstract theory should not prevent us from
solving the goal, i.e., if the original theory had a solution
to the goal, then the abstract one should too.
Finally, in order for the abstraction to be computationally
effective, the solutions that are preserved by
the abstraction should be the cheaper ones.

⁷ As long as our reasoning has no form of non-monotonicity.

⁸ Note that this does not necessarily mean that the abstraction
is not useful!

These criteria are naturally formulated in our framework.
Recall that in order to define irrelevance of a
subject $\alpha$, we must give a definition for $Ir(\alpha, \psi, D)$,
i.e., when the subject $\alpha$ is irrelevant to a derivation $D$.
Given a theory $\Delta$, we denote the abstract theory resulting
from removing the irrelevancy $\alpha$ by $f_\alpha(\Delta)$. For
example, if $\alpha$ is a distinction between predicates, $f_\alpha(\Delta)$
is the theory resulting from predicate abstraction. The
exact form of $f_\alpha(\Delta)$ is discussed in the next section.
We base our definition of $Ir$ on a mapping $h_\alpha$ from the
derivations of $\psi$ in $\Delta$, denoted by $\mathcal{D}_1$, to the derivations
in $f_\alpha(\Delta)$, denoted by $\mathcal{D}_2$. The only requirement from $h_\alpha$
is that it is onto $\mathcal{D}_2$. $h_\alpha$ need not be a total mapping on
$\mathcal{D}_1$, i.e., there might be derivations of $\psi$ that will not
be mapped to the abstract theory, and it need not be
1-1. Other constraints on $h_\alpha$ will yield stronger forms
of irrelevance and therefore stronger justifications for
the abstraction (for example, $h_\alpha$ will be called a simplifying
mapping if for any $D \in \mathcal{D}_1$, the cost of $h_\alpha(D)$ is
no more than the cost of $D$⁹). Given $h_\alpha$, $Ir(\alpha, \psi, D)$
is defined as follows:

Definition 12: $Ir(\alpha, \psi, D)$ is true iff $h_\alpha(D)$ is not
empty. ∎

Note that in this definition $h_\alpha$ is dependent on $\alpha$
and $\psi$. Definitions of weak and strong irrelevance are
obtained by quantifying the definition of $Ir$ over a chosen
set of derivations. The following states that the
first two requirements of an abstraction are satisfied
by weak irrelevance.

Observation 13: If $\mathcal{D}_0$ is a set of derivations in $\mathcal{D}_1$,
and $WI(\alpha, \psi, \mathcal{D}_0)$ holds, then $\psi$ is provable from $\Delta$ if
and only if $f_\alpha(\psi)$ is provable from $f_\alpha(\Delta)$. ∎

In order to satisfy the third requirement, we must impose
a restriction on $\mathcal{D}_0$:

Observation 14: If $\mathcal{D}_0$ is a set of derivations that
contains all minimal derivations and $h_\alpha$ is a simplifying
mapping, then if $SI(\alpha, \psi, \mathcal{D}_0)$ holds, $f_\alpha(\psi)$ will have a
solution in the abstract theory if and only if it has one
in the original theory, and at least one of the abstract-level
solutions will cost no more than the cheapest solution
in the original theory. ∎
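
The role of the derivation mapping can be illustrated on Example 2. The sketch below is an assumption-laden toy: derivations are encoded as tuples of formula names, the mapping $h_\alpha$ simply renames each base formula to its abstraction and is undefined (returns `None`) when some formula has no abstract counterpart, and cost is the number of formulas used. None of these encodings come from the original text.

```python
# Abstractions of the formulas of Example 2; r3 and r4 have no abstract counterpart.
ABSTRACTION_OF = {"r1": "s1", "r2": "s1", "r5": "s2"}

def h(derivation):
    """Partial mapping from base-level derivations to abstract ones (None if undefined)."""
    if all(name in ABSTRACTION_OF for name in derivation):
        return tuple(ABSTRACTION_OF[name] for name in derivation)
    return None

def ir(derivation):
    """Definition 12, with 'h(D) is not empty' read as 'h is defined on D'."""
    return h(derivation) is not None

def weakly_irrelevant(derivations):
    return any(ir(d) for d in derivations)

def strongly_irrelevant(derivations):
    return all(ir(d) for d in derivations)

def is_simplifying(derivations, cost=len):
    """h never increases cost on the derivations where it is defined."""
    return all(cost(h(d)) <= cost(d) for d in derivations if ir(d))

# The only derivation of Vehicle(Camry): FamilyCar(Camry) via r5, then r2.
vehicle_derivations = [("r2", "r5")]
image = h(vehicle_derivations[0])
```

For the query $\mathit{Vehicle}(\mathit{Camry})$ the single base derivation has an image, so the predicate distinction is both weakly and strongly irrelevant here, whereas any derivation that uses $r_3$ would fall outside the domain of `h`.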

This condition is a sound justification for creating
the abstraction. Imposing more constraints on $h_\alpha$ will
give us even stronger justifications. For example, we
can require that $h_\alpha(D)$ effectively break up $D$ into
subproblems of equal size. Knoblock [Knoblock, 1990;
Knoblock et al., 1991] shows how this constraint, along
with others, affects the ability to achieve savings when
employing hierarchical planning.

⁹ Given some cost model for derivations, such as the
number of nodes in the proof tree.

Weaker relevance claims can also be given to the
system. For example, we can state that a distinction $\alpha_1$
is more relevant than a distinction $\alpha_2$, i.e., whenever
$\alpha_1$ is justifiably abstracted, so is $\alpha_2$. Another kind
of claim is a probabilistic one, i.e., stating to the
system that in most cases $\alpha$ is irrelevant to $\psi$. The system
can then use this claim and succeed in most cases
and backtrack in others. By stating irrelevance claims
declaratively we can also state under what conditions
the relevance claim holds.

In the next section we examine the case of predicate 
abstractions and show they are justified by irrelevance 
of a predicate distinction. 

Irrelevance of Predicate Distinctions 

When designing a representation, a decision has to be
made about the detail with which to conceptualize the
world. In some cases, identifying a property $P$ (e.g.,
$\mathit{Car}(x)$) will suffice. In other cases we need to refine
$P$ to subclasses $\mathcal{P} = \{P_1, \ldots, P_n\}$ (e.g., $\mathit{SportsCar}(x)$,
$\mathit{FamilyCar}(x)$, etc.). For some goals, the finer distinction
of properties is irrelevant, and therefore, reasoning
will be more efficient if we change the theory by
abstracting the distinction. We would like to be able
to give the system knowledge about the domain that
will guide it in deciding when a predicate distinction
is relevant. To define the meaning of such an irrelevance
claim in the framework, we first must define the
abstract theory resulting from removing the predicate
distinction and the mapping of derivations between the
original and abstract theories.

The Abstract Theory 

Suppose we have a theory $\Delta$ that includes a set of
predicates $\mathcal{P} = \{P_1, \ldots, P_n\}$, and we want to abstract
the distinction between them by replacing them by a
predicate $P$ that represents their union (e.g., we want
to replace $\{\mathit{FamilyCar}, \mathit{SportsCar}\}$ by the predicate
$\mathit{Car}$). Intuitively, to abstract the theory $\Delta$, we replace
every occurrence of a predicate in $\mathcal{P}$ in every
formula in $\Delta$ by $P$ (e.g., abstract
$\mathit{FamilyCar}(x) \Rightarrow \mathit{Vehicle}(x)$ by
$\mathit{Car}(x) \Rightarrow \mathit{Vehicle}(x)$). However, doing
so for every formula in $\Delta$ might result in an inconsistent
theory or in a theory that will entail conclusions
that were not entailed by the original one. In Example 2,
abstracting rule $r_4$ will result in a contradiction
($\mathit{Car}(x) \Rightarrow \neg \mathit{Car}(x)$), and abstracting $r_3$ will result in
a fact that is not entailed by the theory (i.e.,
$\mathit{Car}(x) \Rightarrow \mathit{HighRiskInsurance}(x)$ does not follow from the
theory). In order to assure that our derivation mapping
will be onto, we need the abstract theory to be consistent
with the ground one. Tenenberg [Tenenberg, 1987;
Tenenberg, 1990] discusses predicate abstractions and
defines the maximal set of formulas that can be included
in the abstract theory such that the abstract
theory will be consistent with the original one. His
definition is based on the interpretation of the abstract
predicate, which is the union of the interpretations of
the predicates in $\mathcal{P}$. However, as Tenenberg notes, this
set is usually infinite even when the ground theory is
finite. Therefore, the abstract theory we consider is
a finite subset of the one defined by Tenenberg. Our
abstract theory consists of the abstractions of the formulas
in the base theory that are independent of the
predicate distinction. Intuitively, a formula is independent
if its abstraction is consistent with the theory.
In the formal definition, we assume that formulas are
represented as clauses. A literal in a clause is negative
if it is a negation of an atomic formula (e.g., $\neg P(x)$
is a negative literal, while $P(x, y)$ is a positive literal).
$Neg(C)$ ($Pos(C)$) denotes the set of negative (positive)
literals in a clause $C$.

Definition 15: Independence. Let $\mathcal{P} = \{P_1, \ldots, P_n\}$,
and suppose $Neg(C)'$ is the result of substituting
every occurrence of an element of $\mathcal{P}$ in $Neg(C)$
by some other predicate in $\mathcal{P}$ using a mapping $f_1$. (Two
occurrences of the same predicate need not have the
same mapping under $f_1$.) A clause $C$ is independent
of a predicate distinction $\mathcal{P}$ with respect to a ground
theory $\Delta$ if for any such $f_1$ there exists a mapping
$f_2$ of the occurrences of elements of $\mathcal{P}$ in $Pos(C)$ to
elements of $\mathcal{P}$, such that $Pos(C)' = f_2(Pos(C))$ and

$\Delta \vdash Pos(C)' \cup Neg(C)'$¹⁰. ∎

Note that a clause that contains only positive literals
from $\mathcal{P}$ will be independent whenever it is provable
from the theory. The problem arises with the negative
literals. In example 2, all rules but $r_3$ are independent
of the distinction $\{FamilyCar, SportsCar\}$.

Lemma 16: A clause $C$ is independent of a predicate
distinction $\mathcal{P}$ if and only if $f(C)$ would be included in
the abstract theory as defined by Tenenberg in
[Tenenberg, 1990].

The Derivation Mapping 

Given the abstract theory produced by removing the 
predicate distinction, we can define the mapping of 
derivations in the base-theory to those in the abstract 
one. Recall that we require that the mapping be an 
onto mapping. Intuitively, given a derivation in the 
abstract theory, a base-level derivation that is mapped 
to it should be obtainable by reversing the abstraction 
function on the formulas in the derivation. However, 
as the following example shows, this cannot always be 
done. 

Example 17: Consider the following knowledge base:

$r_1$ : $P_1(x) \Rightarrow Q(x)$
$r_2$ : $P_2(x) \Rightarrow R(x)$
$r_3$ : $R(x) \Rightarrow P_1(x)$
$r_4$ : $P_2(a)$

10 Notice that in the definition we use $\vdash$, which assumes
a simple case where the base-level reasoner and the
meta-level reasoner are the same. However, in general, they need
not be the same.


Suppose we want to abstract $P_1, P_2$ by an abstract
predicate $P$. The resulting abstract theory will be:

$s_1$ : $P(x) \Rightarrow Q(x)$
$s_2$ : $R(x) \Rightarrow P(x)$
$s_3$ : $P(a)$

$s_1$ is included in the abstract theory because $r_1$
is independent of the predicate distinction (because
$P_2(x) \Rightarrow Q(x)$ is derivable from the theory).

The (single) derivation of $Q(a)$ in the abstract theory
cannot be trivially mapped to a base-level derivation.
The reason is that it uses $s_1$ and $s_3$, which are
abstractions of $r_1$ and $r_4$, and these do not yield a
base-level derivation of $Q(a)$.

The source of the problem is that some reasoning was
done in the process of creating the abstract theory. In
this case, $s_1$ already represented a base-level chain of
reasoning that derived $P_2(x) \Rightarrow Q(x)$.
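The independence test of Definition 15 and the construction of the abstract theory can be sketched on Example 17 as follows. This is a minimal illustration rather than the paper's algorithm: it assumes a fragment with only rules of the form body(x) ⇒ head(x) and ground unary facts, where entailment reduces to reachability, and it reads $r_3$ as R(x) ⇒ P_1(x). All function names are hypothetical.

```python
# Base theory of Example 17, restricted to rules body(x) => head(x) and
# ground unary facts. Restricting to this fragment is an assumption.
RULES = {"r1": ("P1", "Q"), "r2": ("P2", "R"), "r3": ("R", "P1")}
FACTS = {"r4": ("P2", "a")}
DISTINCTION = {"P1", "P2"}   # predicates whose distinction is removed
ABSTRACT = "P"               # the abstract predicate replacing them


def implies(src, dst):
    """True if src(x) => dst(x) follows by chaining rules (reachability)."""
    seen, stack = set(), [src]
    while stack:
        pred = stack.pop()
        if pred == dst:
            return True
        if pred not in seen:
            seen.add(pred)
            stack.extend(h for b, h in RULES.values() if b == pred)
    return False


def holds(pred, const):
    """True if the ground atom pred(const) is derivable from the theory."""
    return any(c == const and implies(p, pred) for p, c in FACTS.values())


def options(pred):
    """Base predicates that an occurrence of pred may be remapped to."""
    return sorted(DISTINCTION) if pred in DISTINCTION else [pred]


def rule_is_independent(body, head):
    # Every remapping of the negative (body) occurrence must admit some
    # remapping of the positive (head) occurrence that the theory entails.
    return all(any(implies(b, h) for h in options(head))
               for b in options(body))


def fact_is_independent(pred, const):
    # A purely positive clause only needs one entailed remapping.
    return any(holds(p, const) for p in options(pred))


def lift(pred):
    """Apply the abstraction: collapse the distinction into ABSTRACT."""
    return ABSTRACT if pred in DISTINCTION else pred


def abstract_theory():
    """Abstractions of the independent rules and facts."""
    rules = {(lift(b), lift(h)) for b, h in RULES.values()
             if rule_is_independent(b, h)}
    facts = {(lift(p), c) for p, c in FACTS.values()
             if fact_is_independent(p, c)}
    return rules, facts
```

Under these assumptions, `abstract_theory()` returns the rules P(x) ⇒ Q(x) and R(x) ⇒ P(x) together with the fact P(a). The rule $r_2$ is dropped because P_1(x) ⇒ R(x) is not derivable.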

Informally, we define the derivation mapping, $h_a$,
by specifying all the base-level derivations that map to
a given abstract-level derivation $D$. The mapping is
defined in two steps as follows. Given $D$, we first
construct all the possible mappings in which occurrences
of $P$ in $D$ are mapped to elements of $\mathcal{P}$, such that
the resulting derivation is a valid one. For example, in
Figure 3, the abstract-level derivation (a) has two such
possible mappings, (b) and (c). Next, we try to complete
each of the resulting derivations so that it becomes a
valid derivation in the base-level theory. In our example,
(b) cannot be completed because $P_1(a)$ does not
follow from our original theory; (c), however, can be
completed, as shown in (d). Any such complete
base-level derivation is mapped to $D$ under the mapping
$h_a$. In Figure 3 only (d) is mapped to the abstract-level
derivation (a) (i.e., $h_a(d) = a$).
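The first of the two steps, enumerating the ways to map occurrences of $P$ so that every resolution step stays valid, can be sketched as below. This is only an illustration under simplifying assumptions: a derivation is reduced to a list of named occurrences of $P$ plus the pairs of occurrences that are resolved against each other, and the occurrence names are hypothetical. The completion step is not modelled.

```python
from itertools import product

DISTINCTION = ("P1", "P2")   # base predicates collapsed into the abstract P

# Occurrences of P in the abstract derivation (a) of Q(a):
#   "s1.body" is the P in  P(x) => Q(x)
#   "s3.head" is the P in  P(a)
OCCURRENCES = ("s1.body", "s3.head")

# Pairs of occurrences that are resolved against each other in (a).
RESOLVED = [("s1.body", "s3.head")]


def candidate_mappings(occurrences, resolved, distinction):
    """Enumerate assignments of base predicates to occurrences of P.

    An assignment is kept only if both literals of every resolution step
    receive the same base predicate, so the step remains a valid one.
    """
    kept = []
    for choice in product(distinction, repeat=len(occurrences)):
        mapping = dict(zip(occurrences, choice))
        if all(mapping[a] == mapping[b] for a, b in resolved):
            kept.append(mapping)
    return kept


CANDIDATES = candidate_mappings(OCCURRENCES, RESOLVED, DISTINCTION)
```

For derivation (a), two of the four possible assignments survive: one maps both occurrences to P1 and one maps both to P2. These correspond to (b) and (c) in Figure 3.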

In order to show that $h_a$ is onto, we must show that
at least one of the intermediate derivations can be
completed to a valid derivation from the base-level theory.

We prove this by defining one mapping $M$ from the
occurrences of $P$ in $D$ to $\mathcal{P}$. $M$ will have the property
that when we apply it to $D$, the resulting derivation
is guaranteed to have a completion to a valid
base-level derivation. Let $C$ be the set of leaves of the
abstract-level derivation $D$ that contain the predicate $P$. We
define $M$ on the occurrences of $P$ in $C$ such that two
literals that are resolved somewhere in $D$ are assigned
the same predicate in $\mathcal{P}$. That ensures that $M$ can be
extended to all the occurrences of $P$ in $D$. For clarity,
we assume that $P$ does not appear in the root of $D$,
and that $D$ does not have any non-trivial factoring (see
[Genesereth and Nilsson, 1987]). We define a partial
order $<$ on the clauses in $C$, and make assignments to
clauses in the topological order induced by $<$.

Definition 18: For every $C_i, C_j \in C$, $C_i < C_j$ iff an
ancestor of $C_i$ is resolved with an ancestor of $C_j$ on
a literal in $\mathcal{P}$, and the ancestor of $C_i$ contributes the
positive literal to the resolution.

Lemma 19: The relation < is acyclic. 
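(a) $Q(a)$ from $P(x) \Rightarrow Q(x)$ and $P(a)$.

(b) $Q(a)$ from $P_1(x) \Rightarrow Q(x)$ and $P_1(a)$.

(c) $Q(a)$ from $P_2(x) \Rightarrow Q(x)$ and $P_2(a)$.

(d) $Q(a)$ from $P_2(a)$ and $P_2(x) \Rightarrow Q(x)$;
$P_2(x) \Rightarrow Q(x)$ from $P_2(x) \Rightarrow P_1(x)$ and $P_1(x) \Rightarrow Q(x)$;
$P_2(x) \Rightarrow P_1(x)$ from $P_2(x) \Rightarrow R(x)$ and $R(x) \Rightarrow P_1(x)$.

Figure 3: Mapping base-level derivations to abstract-level derivations
(each derivation tree is listed as a conclusion followed by its premises)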


Also note that if $C_i$ is minimal in the order $<$ (i.e.,
there is no $C_j$ such that $C_j < C_i$), then $C_i$ contains
only positive appearances of $P$.

We define $M$ on the occurrences of $P$ in $C_i$ only after
we have defined the mapping for all its occurrences in
clauses $C_j$ such that $C_j < C_i$, as follows:

• If $C_i$ contains only positive appearances of $P$, we
map the occurrences of $P$ such that the resulting
clause is entailed by the base-level theory.
Note that by the definition of the abstract theory,
there must be at least one such mapping for $C_i$.

• If $C_i$ contains negative literals of $P$, we do the following.
For any negative occurrence of $P$, the positive
literal with which it is resolved in $D$ has already been
mapped (by the definition of $<$). Hence
we map it to the same element of $\mathcal{P}$ to which its
counterpart was mapped. As for the positive literals,
any assignment for them such that the resulting
clause is derivable from the base theory is a valid
assignment. The definition of the abstract theory
(i.e., all elements of $C$ are abstractions of independent
base-level clauses) guarantees that at least one
such assignment exists.

The mapping $M$ guarantees that every leaf of the
tree is either in the knowledge base or is derivable from
it. Therefore, the resulting tree can be completed to a
full base-level derivation.
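The order in which $M$ is assigned to the leaves can be sketched as a topological sort over the relation $<$. The snippet below is only an illustration: the leaf names are hypothetical labels, the relation is assumed to be given explicitly as pairs, and it relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def assignment_order(leaves, precedes):
    """Order the leaves so that ci comes before cj whenever ci < cj.

    leaves   -- hypothetical names of the leaf clauses that contain P
    precedes -- pairs (ci, cj) meaning ci < cj
    """
    sorter = TopologicalSorter({leaf: set() for leaf in leaves})
    for ci, cj in precedes:
        sorter.add(cj, ci)   # cj may only be processed after ci
    # static_order raises graphlib.CycleError if the relation has a
    # cycle, which cannot happen when the relation is acyclic.
    return list(sorter.static_order())


# Derivation (a) of Figure 3: the leaf P(a) supplies the positive literal
# that is resolved against the negative P in P(x) => Q(x).
ORDER = assignment_order(
    ["P(x) => Q(x)", "P(a)"],
    [("P(a)", "P(x) => Q(x)")],
)
```

Here the purely positive leaf P(a) is processed first. The occurrence of $P$ in the body of P(x) ⇒ Q(x) then inherits whatever predicate was chosen for P(a).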

Theorem 20: The derivation mapping $h_a$ is well defined
and onto (i.e., every derivation in the abstract
theory has at least one derivation in the base theory
that maps to it), and is a simplifying abstraction.

Properties of the Irrelevance Definition 

Given the definition of irrelevance, the question arises 
whether given the original theory and the abstract one, 
it is possible to decide if the predicate distinction is ir- 
relevant to the goal. The following provides a first step 
in that direction by identifying a class of derivations 
that are preserved by the abstraction. 

Theorem 21: If $\mathcal{D}_0$ is a set of derivations of the goal
such that for any $D \in \mathcal{D}_0$, all the facts in $Base(D)$
are independent of the predicate distinction $\mathcal{P}$, then
$SI(\mathcal{P}, \psi, \mathcal{D}_0)$ holds.

Observation 22: The converse does not hold. I.e.,
$\psi$ can have a derivation in the abstract theory, but
not have one in the base theory only from independent
facts. Example 17 illustrates that.¹¹

11 Note that if we change the definition of independence to
require $Pos(C)' \cup Neg(C)' \in \Delta$ instead of
$\Delta \vdash Pos(C)' \cup Neg(C)'$, we will get the converse
direction too, i.e., if a goal has a proof in the abstract theory,
it will have one in the ground theory in which all facts are
independent of the predicate distinction.

From this condition and the algorithms described
in [Levy and Sagiv, 1992] we can construct an algorithm
for detecting irrelevance of predicate distinctions
in the following case:

Corollary 23: Given a Datalog theory $\Delta$ and $f_{\mathcal{P}}(\Delta)$,
which is the abstract theory resulting from removing
the distinction between predicates in $\mathcal{P}$, there is an
algorithm to determine whether $SI(\mathcal{P}, \psi, \mathcal{D}_0)$ holds for
any given set of ground facts, where $\mathcal{D}_0$ is the set of all
non-redundant derivations of $\psi$ from $\Delta$.

Note that creating the abstract theory is in general
undecidable because it entails solving the rule redundancy
problem. Methods for detecting some classes
of redundant rules (e.g., [Sagiv, 1988]) can be used to
construct a subset of the theory.

Other Relevance Subjects

The same technique described above can be used to
define irrelevance of other kinds of relevance subjects.
[Levy, 1992] discusses the following subjects:

• Object aggregations: We replace a set of object
constants by an aggregate object, e.g., replace the
subparts of a component by one object representing
the component. For example, in the Missionaries
and Cannibals problem [Amarel, 1981], we can replace
the sets of missionaries and cannibals by objects
denoting their sets.

• Object distinction: We replace a set of object constants
by a representative object that has only the
properties common to all elements of the set (i.e., we
replace a set $O = \{o_1, \ldots, o_n\}$ by an object $o$, such
that $P(o)$ holds iff $P(o_i)$ holds for every $o_i \in O$).
For example, when reasoning about chemical reactions,
it is enough to consider only one representative
molecule of every type in the chemical formula,
and that suffices to describe the complete reaction
between the substances.

• Predicate representative: We replace a set of
predicates $\mathcal{P}$ by an abstract predicate that represents
their intersection.

• Macro rule: We replace a set of facts $S$ by a logical
consequence $s$ of $S$.

Conclusions

We presented a general formal framework for analyzing
the notion of irrelevance. The framework contains
a space of possible definitions of irrelevance claims that
enabled us to formalize previous definitions (e.g.,
[Subramanian, 1989]) and present new ones. We identified
several important properties of irrelevance claims and
demonstrated how these properties change as we move
in the space of definitions. The framework enabled
us to define irrelevance claims that serve as justifications for
abstractions, thereby providing a new view on work
in abstractions. Justifying abstractions by irrelevance
claims provides a first-principles [Subramanian, 1989]
account of abstractions, elucidating questions such as
automatically creating abstractions, creating abstractions
that are specific for a given goal, and using domain
knowledge to guide the creation of abstractions. This
paper presents only initial work in this direction,
and much remains to be explored.